On Structural Properties of MDPs that Bound Loss Due to Shallow Planning

Authors

  • Nan Jiang
  • Satinder P. Singh
  • Ambuj Tewari
Abstract

Planning in MDPs often uses a smaller planning horizon than the one specified in the problem, saving computational expense at the risk of a loss due to suboptimal plans. Jiang et al. [2015b] recently showed that smaller-than-specified planning horizons can in fact be beneficial when the MDP model is learned from data and is therefore inaccurate. In this paper, we consider planning with accurate models and investigate structural properties of MDPs that bound the loss incurred by using smaller-than-specified planning horizons. We identify a number of structural parameters: some depend on the reward function alone, some on the transition dynamics alone, and some on the interaction between rewards and transition dynamics. We provide planning loss bounds in terms of these structural parameters and, in some cases, also show that the upper bounds are tight. Empirical results with randomly generated MDPs validate qualitative properties of our theoretical bounds for shallow planning.
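To make the notion of planning loss concrete, here is a minimal sketch. It is not the authors' code. It assumes that the planning horizon is controlled through the discount factor, so shallow planning means planning with `gamma_plan` smaller than the evaluation discount `gamma_eval`. It also assumes a small tabular MDP with random transitions and rewards in [0, 1]. All function and parameter names are hypothetical.

```python
import random

def random_mdp(n_states, n_actions, seed=0):
    """Build a random tabular MDP: P[s][a] is a next-state distribution, R[s][a] is in [0, 1]."""
    rng = random.Random(seed)
    P, R = [], []
    for _ in range(n_states):
        p_row, r_row = [], []
        for _ in range(n_actions):
            w = [rng.random() for _ in range(n_states)]
            total = sum(w)
            p_row.append([x / total for x in w])  # normalize to a distribution
            r_row.append(rng.random())
        P.append(p_row)
        R.append(r_row)
    return P, R

def q_value(P, R, V, s, a, gamma):
    """One-step backup of action a in state s against value estimate V."""
    return R[s][a] + gamma * sum(p * v for p, v in zip(P[s][a], V))

def plan(P, R, gamma, iters=2000):
    """Value iteration with discount gamma; returns the greedy policy."""
    n = len(P)
    V = [0.0] * n
    for _ in range(iters):
        V = [max(q_value(P, R, V, s, a, gamma) for a in range(len(P[s])))
             for s in range(n)]
    return [max(range(len(P[s])), key=lambda a: q_value(P, R, V, s, a, gamma))
            for s in range(n)]

def evaluate(P, R, policy, gamma, iters=2000):
    """Iterative policy evaluation of a deterministic policy under discount gamma."""
    V = [0.0] * len(P)
    for _ in range(iters):
        V = [q_value(P, R, V, s, policy[s], gamma) for s in range(len(P))]
    return V

def planning_loss(P, R, gamma_plan, gamma_eval):
    """Worst-case (over states) value gap, measured under gamma_eval, between the
    policy planned with gamma_eval and the policy planned with gamma_plan."""
    v_opt = evaluate(P, R, plan(P, R, gamma_eval), gamma_eval)
    v_shallow = evaluate(P, R, plan(P, R, gamma_plan), gamma_eval)
    return max(o - s for o, s in zip(v_opt, v_shallow))

if __name__ == "__main__":
    P, R = random_mdp(n_states=5, n_actions=3, seed=1)
    for g in (0.0, 0.5, 0.9, 0.95):
        print(g, planning_loss(P, R, gamma_plan=g, gamma_eval=0.95))
```

In this sketch the loss is zero when `gamma_plan` equals `gamma_eval`, and it can never exceed `1 / (1 - gamma_eval)` because rewards lie in [0, 1]. How quickly it shrinks as `gamma_plan` grows depends on the particular MDP, which is the kind of dependence the structural parameters in the abstract are meant to capture.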


Similar resources

Investigation of structural, morphological and dynamic mechanical properties of unvulcanized PDMS/silica compound

In this study, the interaction between the silica filler and polydimethylsiloxanes (PDMS) was investigated from the aspects of the bound rubber and morphological characterization. With special attention to the dynamic properties, the dynamic test was conducted by dynamic shear rheometer. The results show that the modified fillers disperse uniformly within PDMS matrix without aggregation and con...


Identification of Dynamic Damping Properties of a Flexible Structural Adhesive

In this paper, dynamic damping properties of a nominated flexible structural adhesive have been identified using an extended-direct modal based joint identification method. It has been revealed that the damping characteristics of the adhesive are correlated with both frequency and mode shape. Young’s and shear moduli increase with frequency, whereas damping decreases. The results showed that ...


Investigations on structural and electrical properties of Cadmium Zinc Sulfide thin films

Nowadays, II – IV group semiconductor thin films have attracted considerable attention from the research community because of their wide range of application in the fabrication of solar cells and other opto-electronic devices. Cadmium zinc sulfide (Zn-CdS) thin films were grown by chemical bath deposition (CBD) technique. X-ray diffraction (XRD) is used to analyze the structure and crystallite ...



Accelerated decomposition techniques for large discounted Markov decision processes

Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...




Publication date: 2016